On Summarizing Graph Streams

نویسندگان

  • Nan Tang
  • Qing Chen
  • Prasenjit Mitra
چکیده

Graph streams, which refer to the graph with edges being updated sequentially in a form of a stream, have wide applications such as cyber security, social networks and transportation networks. This paper studies the problem of summarizing graph streams. Specifically, given a graph stream G, directed or undirected, the objective is to summarize G as SG with much smaller (sublinear) space, linear construction time and constant maintenance cost for each edge update, such that SG allows many queries over G to be approximately conducted efficiently. Due to the sheer volume and highly dynamic nature of graph streams, summarizing them remains a notoriously hard, if not impossible, problem. The widely used practice of summarizing data streams is to treat each element independently by e.g., hashor sampling-based method, without keeping track of the connections between elements in a data stream, which gives these summaries limited power in supporting complicated queries over graph streams. This paper discusses a fundamentally different philosophy for summarizing graph streams. We present gLava, a probabilistic graph model that, instead of treating an edge (a stream element) as the operating unit, uses the finer grained node in an element. This will naturally form a new graph sketch where edges capture the connections inside elements, and nodes maintain relationships across elements. We discuss a wide range of supported graph queries and establish theoretical error bounds for basic queries.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

SBG-Sketch: A Self-Balanced Sketch for Labeled-Graph Stream Summarization

Applications in various domains rely on processing graph streams, e.g., communication logs of a cloud-troubleshooting system, roadnetwork traffic updates, and interactions on a social network. A labeled-graph stream refers to a sequence of streamed edges that form a labeled graph. Label-aware applications need to filter the graph stream before performing a graph operation. Due to the large volu...

متن کامل

From Big Bang to Big Crunch

A graph stream, which refers to the graph with edges being updated sequentially in a form of a stream, has important applications in cyber security and social networks. Due to the sheer volume and highly dynamic nature of graph streams, the practical way of handling them is by summarization. Given a graph stream G, directed or undirected, the problem of graph stream summarization is to summariz...

متن کامل

Summarizing World Speak : A Preliminary Graph Based Approach

Social media platforms play a crucial role in piecing together global news stories via their corresponding online discussions. Thus, in this work, we introduce the problem of automatically summarizing massively multilingual microblog text streams. We discuss the challenges involved in both generating summaries as well as evaluating them. We introduce a simple word graph based approach that util...

متن کامل

Summarizing Massive Information for Querying Web Sources and Data Streams

OF THE DISSERTATION Summarizing Massive Information for Querying Web Sources and Data Streams

متن کامل

Summarizing Multidimensional Data Streams: A Hierarchy-Graph-Based Approach

With the rapid development of information technology, many applications have to deal with potentially infinite data streams. In such a dynamic context, storing the whole data stream history is unfeasible and providing a high-quality summary is required for decision makers. In this paper, we propose a summarization method for multidimensional data streams based on a graph structure and taking ad...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • CoRR

دوره abs/1510.02219  شماره 

صفحات  -

تاریخ انتشار 2015